Levenshtein transducer

In computer science, Levenshtein automata for a formal language are the family of finite state automata that can recognize the set V of all words in the language for which the Levenshtein distance to an arbitrary word w does not exceed a particular constant. Levenshtein transducers are extended Levenshtein automata that can in addition generate all strings in V at a fixed Levenshtein distance from a given w.

A Levenshtein automaton for W can be constructed in linear time with respect to the length of W, and can identify V in less time than would be needed if the Levenshtein distance to W was calculated for each word in the language (a problem with quadratic time complexity).

Since Levenshtein automata are finite-state machines, they can be described in (some) regular expression frameworks and finite-state algebra applies to them. In particular, Levenshtein transducers can be composed with other finite-state transducers.

Experimental implementations of the Levenshtein automaton exist in Python[1][2] and Java[3]

Footnotes

  1. ^ Barrette-Lapierre, Jean-Philippe. Moman. May 14, 2009. Retrieved on 2010-10-29.
  2. ^ Johnson, Nick. Damn Cool Algorithms: Levenshtein Automata. July 28, 2010. Retrieved on 2010-10-29.
  3. ^ McCandless, Mike. Lucene's FuzzyQuery is 100 times faster in 4.0. March 24, 2011. Retrieved on 2011-06-23.

External links